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ABSTRACT 

In  evolutionary  graph  theory  biologists  study  the  problem 
of  determining  the  probability  that  a  small  number 
of  mutants  overtake  a  population  that  is  structured  on  a 
weighted,  possibly  directed  graph.  Currently  Monte  Carlo 
simulations  are  used  for  estimating  such  fixation  probabilities 
on  directed  graphs,  since  no  good  analytical  methods 
exist.  In  this  paper,  we  introduce  a  novel  deterministic  algorithm 
for  computing  fixation  probabilities  for  strongly 
connected  directed,  weighted  evolutionary  graphs  under 
the  case  of  neutral  drift,  which  we  show  to  be  a  lower  bound 
for  the  case  where  the  mutant  is  more  fit  than  the  rest  of  the 
population  (previously,  this  was  only  observed  from  simulation). 

We  also  show  that,  in  neutral  drift,  fixation  probability 
is  additive  under  the  weighted,  directed  case.  We 
implement  our  algorithm  and  show  experimentally  that  it 
consistently  outperforms  Monte  Carlo  simulations  by  several 
orders  of  magnitude,  which  can  allow  researchers  to 
study  fixation  probability  on  much  larger  graphs. 
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ABSTRACT 

In  evolutionary  graph  theory  [1]  biologists  study  the  prob¬ 
lem  of  determining  the  probability  that  a  small  number 
of  mutants  overtake  a  population  that  is  structured  on  a 
weighted,  possibly  directed  graph.  Currently  Monte  Carlo 
simulations  are  used  for  estimating  such  fixation  probabil¬ 
ities  on  directed  graphs,  since  no  good  analytical  methods 
exist.  In  this  paper,  we  introduce  a  novel  deterministic  al¬ 
gorithm  for  computing  fixation  probabilities  for  strongly 
connected  directed,  weighted  evolutionary  graphs  under 
the  case  of  neutral  drift,  which  we  show  to  be  a  lower  bound 
for  the  case  where  the  mutant  is  more  fit  than  the  rest  of  the 
population  (previously,  this  was  only  observed  from  simu¬ 
lation).  We  also  show  that,  in  neutral  drift,  fixation  prob¬ 
ability  is  additive  under  the  weighted,  directed  case.  We 
implement  our  algorithm  and  show  experimentally  that  it 
consistently  outperforms  Monte  Carlo  simulations  by  sev¬ 
eral  orders  of  magnitude,  which  can  allow  researchers  to 
study  fixation  probability  on  much  larger  graphs. 

KEY  WORDS 

Modelling  of  Evolution,  Network  Science,  Stochastic 
Models,  Network  Diffusion 

1  Introduction 

Evolutionary  graph  theory,  introduced  in  [1]  studies  the 
problem  of  a  mutant  overtaking  a  population  whose  under¬ 
lying  structure  is  represented  as  a  directed,  weighted  graph. 
This  model  has  been  applied  to  problems  in  evolutionary 
biology  [2],  physics  [3],  and  game  theory  [4],  Most  work 
with  evolutionary  graphs  concerns  computing  the  fixation 
probability  -  the  probability  that  a  certain  subset  of  mu¬ 
tants  overtakes  the  population.  Although  good  analytical 
approximations  are  available  for  the  undirected  case  [5,  6], 
these  break  down  for  directed,  weighted  graphs  as  shown  in 
[7],  even  in  the  case  of  neutral  drift1 .  As  a  result,  most  work 
dealing  with  evolutionary  graphs  rely  on  Monte  Carlo  sim¬ 
ulations  to  approximate  the  fixation  probability  [8,  9,  10]. 

In  this  paper  we  develop  a  novel  deterministic  algorithm  to 
compute  fixation  probability  in  the  case  of  neutral  drift  in 

'Neutral  drift  is  the  case  where  mutants  and  residents  have  the  same 
fitness. 
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directed,  weighted  evolutionary  graphs  based  on  the  con¬ 
vergence  of  “vertex  probabilities”  to  the  fixation  probabil¬ 
ity  as  time  approaches  infinity.  We  then  prove  that  the  fixa¬ 
tion  probability  computed  in  neutral  drift  is  a  lower  bound 
for  the  fixation  probability  when  the  mutant  is  more  fit 
than  the  resident,  confirming  the  simulation  observations 
of  [11].  We  also  show  that  fixation  probability  under  neu¬ 
tral  drift  is  additive  (even  for  weighted,  directed  graphs), 
which  extends  the  work  of  [6]  which  proved  this  for  undi¬ 
rected  graphs.  Further,  we  implemented  our  algorithm  and 
show  that  it  outperforms  Monte  Carlo  simulations  by  sev¬ 
eral  orders  of  magnitude.  The  paper  is  organized  as  fol¬ 
lows.  In  Section  2  we  review  the  original  model  of  [1]  and 
introduce  the  idea  of  “vertex  probabilities.”  In  Section  3 
we  show  how  vertex  probabilities  can  be  used  to  find  the 
fixation  probability.  This  is  followed  by  our  experimental 
evaluation  in  Section  4.  Finally,  related  work  is  discussed 
in  Section  5. 

2  Technical  Preliminaries 

The  Moran  Process  [12]  is  a  stochastic  process  used  to 
model  evolution  in  a  well-mixed  population.  All  the  in¬ 
dividuals  in  the  population  are  either  mutants  or  residents. 
The  aim  of  such  work  was  to  determine  if  a  set  of  mutants 
cold  take  over  a  population  of  residents.  In  [1],  evolution¬ 
ary  graph  theory  (EGT)  is  introduced,  which  generalizes 
the  model  of  the  Moran  Process  by  specifying  relationships 
between  the  N  individuals  of  the  population  in  the  form  of 
a  directed,  weighted  graph.  We  re-introduce  their  model 
in  this  section.  First,  we  define  an  evolutionary  graph  is 
defined  as  follows. 

Definition  1  (Evolutionary  Graph  (EG)  [1]).  Given  natural 
number  N,  set  of  individuals  V  =  {t>i, . .....  Vjy} 

and  adjacency  matrix  W  =  [vjtj]  s.t.  Vi,  =  1 

and  wu  =  0,  the  tuple  ( N ,  V.  W)  is  an  evolutionary  graph 
(EG). 

Intuitively,  Wij  is  the  weight  of  the  edge  from  vertex  vt  to 
Vj.  In  all  literature  on  evolutionary  graph  theory  known 
to  the  authors,  the  evolutionary  graph  is  assumed  to  be 
strongly  connected  as  defined  below.  We  make  the  same 
assumption. 


Figure  1:  Example  strongly  connected,  directed,  and 
weighted  graph. 

Definition  2  (Strongly  Connected).  An  evolutionary  graph 
( TV ,  V.  W)  is  strongly  connected  if  for  all  distinct  vertices 
Vi,  Vj  £  V,  there  is  a  directed  path  from  Vi  to  Vj  such  that 
for  all  edges  in  the  path  ( va ,  Vb),  wab  >  0. 

As  all  individuals  in  the  population  are  either  mutants  or 
residents,  we  define  a  configuration  of  mutants  and  resi¬ 
dents  in  the  EG. 

Definition  3  (Configuration).  Given  evolutionary  graph 
( TV ,  V,  W ),  a  configuration  C  is  a  subset  ofV  such  that  all 
individuals  in  C  are  mutants  and  all  individuals  in  V  —  C 
are  residents.  The  set  2V  is  the  set  of  all  configurations. 

In  [1],  the  authors  specify  a  stochastic  process  known  as 
the  Moran  Process  on  Graphs  which  we  define  below. 

Definition  4  (Graphical  Moran  Process  (GMP)  [1]).  Given 
evolutionary  graph  (TV,  V,  W)  and  real  number  r  >  0 
(known  as  the  relative  fitness),  the  Graphical  Moran  Pro¬ 
cess  (GMP)  is  specified  as  follows.  For  any  configuration 
C  C  V,  some  vertex  v.t  £  V  is  selected  for  birth.  If  v-i 
is  in  C  it  is  selected  with  a  probability  r_^c\+\v-c\  and 

otherwise.  Then  vertex  v,  is  selected  with 
r-\C\+\V—C\  J 

probability  Wij  for  death  and  replaced  by  a  clone  of  vl. 

Hence,  if  Vi  is  a  mutant,  the  new  configuration  is  C  U  {^j} 

and  C  —  {vj}  otherwise. 

Note  if  r  =  1,  we  say  we  are  in  the  special  case  of  neutral 
drift.  The  following  is  an  example  of  this  process. 

Example  1.  Consider  the  evolutionary  graph  specified  by 
Figure  1,  which  is  directed,  weighted,  and  strongly  con¬ 
nected.  Suppose  vertex  1  in  the  graph  is  a  mutant.  Then, 
one  possible  sequence  that  leads  to  fixation  is  shown  in  Fig¬ 
ure  2  and  has  the  probability  of  2.93  •  10~5  under  neutral 
drift  (r  =  1).  For  instance,  to  transition  from  the  config¬ 
uration  labeled  A  to  the  configuration  labeled  B  (where 
node  2  becomes  a  mutant  as  well)  requires  for  node  1  to 
be  picked  for  birth  (with  a  probability  0.25)  and  node  2 
selected  to  die  (which  occurs  with  a  probability  of  0.5  by 
the  edge  weight).  Hence,  the  transition  to  configuratiaon 
B  from  A  is  0.125. 

Based  on  the  definition  of  GMP,  we  will  use  the  notation 
Pr(C^)  to  refer  to  the  probability  of  being  in  configu¬ 
ration  C  after  t  timesteps  of  the  GMP.  Perhaps  the  most 
widely  studied  problem  in  evolutionary  graph  theory  is  to 
determine  the  fixation  probability.  We  define  it  formally 
below. 


Figure  2:  A  sequence  of  mutant-resident  configurations 
leading  to  fixation  that  happens  with  a  probability  of  2.93  • 
10-5  under  neutral  drift.  Mutant  nodes  are  shaded.  A  is 
the  initial  state  and  F  is  the  final  absorbing  state. 


Definition  5  (Fixation  Probability).  Given  an  evolution¬ 
ary  graph  (TV,  V,  W),  real  number  r  >  0,  and  con¬ 
figuration  C  C  V  the  fixation  probability,  Pc,  is 

limt^ooPriV^CW). 

Intuitively,  this  is  the  probability  that  an  initial  set  C  of 
mutants  takes  over  the  entire  population.  Similarly,  we 
will  use  the  term  the  extinction  probability.  Pc,  to  be 
Umt^ooPrfb^  |C(°)).  If  the  graph  if  strongly  connected, 
we  have  the  following  well-known  result. 

Proposition  1.  Given  EG  (TV,  V,  W),  real  number  r  >  0, 
and  configuration  C  C  V,  if  the  EG  is  strongly  connected, 
then  under  GMP  we  have  Pc  +  Pc  =  1 

The  above  result  essentially  says  that  for  a  strongly  con¬ 
nected  graph,  a  mutant  either  fixates  or  becomes  extinct. 
As  [1]  showed  that  an  optimization  problem  related  to  com¬ 
puting  fixation  probability  is  NP-hard,  much  of  the  work 
on  evolutionary  graph  theory  relies  on  Monte  Carlo  simu¬ 
lations  to  calculate  Pc.  Algorithm  1  shows  pseudo-code 
to  find  the  Pc  for  the  GMP  using  Monte  Carlo  simulations 
for  the  case  of  neutral  drift.  The  time  complexity  is  simply 
0(RTSim),  where  R  is  the  number  of  simulation  runs  exe¬ 
cuted  (loop  at  line  2)  and  TSim  is  the  average  time  it  takes 
for  the  evolutionary  process  to  reach  mutant  extinction  or 
fixation  (the  first  nested  loop  at  line  3). 


Algorithm  1  -  Monte  Carlo  Simulation  to  Compute  Fixa¬ 
tion  Probabilities 

Input:  Evolutionary  Graph  ( N,V,W ),  configuration 
C  c  V,  and  natural  number  R  >  0. 

Output:  Estimate  of  fixation  probability  of  mutant. 

1:  x  -S—  0  {x  will  count  number  of  times  the  mutant  fix¬ 
ates} 

2:  for  *  =  0  — >  R  do 
3:  Set  current  configuration  C*  =  C 

4:  while  While  (C*  £  V)  or  (C*  =  0)  do 

5:  Select  vertex  Vi  £  V  with  probability  1/iV 

6:  Select  vertex  Vj  £  V  with  probability  w,j 

7:  If  Vi  £  C*  set  C*  =  C*  U  {vj}.  Otherwise  set 

C*=C*-{Vj} 

8:  end  while 

9:  If  C*  =  V  then  x  +  4 

10:  end  for 

11:  return  x/R  {the  estimated  fixation  probability} 


Further,  based  on  the  commonly-accepted  definition 
of  estimated  standard  error  from  statistics,  we  can  obtain 
the  estimated  standard  error  for  the  solution  returned  by 
Algorithm  1. 

Proposition  2.  The  estimated  standard  error  of  the  solu¬ 
tion,  S  returned  by  Algorithm  1  is  y J 

This  work  uses  the  idea  of  a  vertex  probilities  to  cre¬ 
ate  an  alternative  to  Algorithm  1.  A  vertex  probability  is 
defined  formally  below. 

Definition  6  (Vertex  Probability).  Given  an  evolutionary 
graph,  (N,  V,  W)  and  time  t  >  0,  the  vertex  proba¬ 
bility  for  Vi  £  V  is  written  Pr(Hfp)  and  defined  as 

£0a-„.„,6CP>-(C(')). 

Hence,  the  vertex  probability  is  the  probability  that  a  given 
vertex  at  a  certain  time  is  a  mutant.  When  we  refer  to  the 
set  vertex  probabilities  of  each  element  of  V  at  some  time 
t,  we  shall  use  the  term  vertex  probability  vector.  Viewing 
the  probability  that  a  specific  vertex  is  a  mutant  at  a  given 
time  has  not,  to  our  knowledge,  been  extensively  studied 
before  with  respect  to  evolutionary  graph  theory  (or  the 
voter  models  in  statistical  physics).  The  key  insight  of  this 
paper  is  that  studying  these  probabilities  sheds  new  light 
on  the  problem  of  calculating  fixation  probabilities.  For 
example,  it  is  easy  to  show  the  following  relationship. 

Proposition  3.  Let  C  be  a  subset  ofV  and  t  be  an  arbitrary 
time  point.  Iff  for  all  Vi  £  C,  Pr (Mb  =  1  and  for  all 
Vi  £  C,  Pr(M[t'>)  =  0,  then  Pr(C W)  =  1  and  for  all 
a  £  2V  s.t.  c  C,  Pr(C '«)  =  0. 

Proof  Sketch.  Suppose,  BWOC,  there  is  some  C'  f 
C  where  Pr{C ’^)  >  0.  Clearly,  there  must  exist  some 
element  Vi  £  C  that  is  not  in  C' .  By  Definition  6,  this 
causes  Pr(l\lfj1'1)  <  1,  which  is  a  contradiction.  Going  the 


other  direction,  we  need  only  to  consider  that  Pr(C^)  is 
in  the  summand  of  each  Pr(IV^t^)  associated  only  with  a 

Vl  £  c.  m 

We  will  also  use  random  variables  S)  '  to  denote  the 
event  that  vertex  v,  was  selected  for  reproduction  and  R,-y 
to  denote  the  event  of  v,  replacing  v:).  We  will  often  use 
conditional  probabilities.  For  example,  Pr(M}4  |C^0^)  is 
the  probability  that  is  a  mutant  given  the  initial  set  C 
of  mutants.  Throughout  this  paper,  unless  noted  other¬ 
wise,  all  of  our  probabilities  will  be  conditioned  on  Cil]> . 
We  will  drop  it  for  ease  of  notation  with  the  understand¬ 
ing  that  some  set  C  of  V  were  mutants  at  t  =  0.  Hence, 
Pr(M^)  =  Pr(M  -,)|C^0^).  It  is  easy  to  verify  that  Pc  >  0 
iff  Vfi  £  V,  limt^roo Pr(M^)  >  0.  Hence,  in  this  pa¬ 
per,  we  shall  generally  assume  that  (zmt_>.oc,Pr(M|f'))  >  0 
holds  for  all  vertices  vt  and  specifically  state  when  it  does 
not.  As  an  aside,  for  a  given  EG,  this  assumption  can  be 
easily  checked  in  in  polynomial  time.  Simply  ensure  for 
Vj  £  V  —  C  that  exists  some  Vj  £  C  s.t.  there  is  a  directed 
path  from  u,  to  Vj. 

3  Directly  Calculating  Fixation  Probability 

Now  that  we  have  introduced  the  model  and  the  idea  of 
vertex  probabilities  we  will  show  how  to  leverage  this  in¬ 
formation  in  our  algorithm  to  compute  fixation  probability. 
First,  we  show  that  as  time  approaches  infinity,  the  vertex 
probabilities  for  all  vertices  converge  to  the  fixation  proba¬ 
bility  when  the  graph  if  strongly  connected. 

Theorem  1.  If  the  graph  is  strongly  connected,  Vi, 

Umt^oaPr{Nt'!))  =  Pc- 

Proof  Sketch.  As  time  approaches  infinity,  there  are  only 
two  possible  configurations  of  mutants  -  sets  0  and  V  corre¬ 
sponding  with  extinction  and  fixation  respectively.  By  Defi¬ 
nition  6,  as  time  approaches  infinity,  the  probability  of  any 
vertex  being  a  mutant  must  be  the  equal  to  the  fixation  prob¬ 
ability.  Hence,  the  statement  follows.  I 
Now  let  us  consider  how  to  calculate  Pr(M,[4  )  for  some  v, 
and  t.  For  t  =  0,  where  we  know  that  we  are  in  the  state 
where  only  vertices  in  a  given  set  are  mutants,  we  need  only 
appeal  to  Proposition  3  -  which  tells  us  that  we  assign  a 
probability  of  1  to  all  elements  in  that  set  and  0  otherwise. 
For  subsequent  timesteps,  we  have  developed  Theorem  2 
shown  next. 

Theorem  2.  Pr(M[f))  equals 

T.(vj,Vi)eE  (wji-Pr^-^.Pri^ \IVff~V) 

-Wji  ■  Pr(IVflt-1))  ■  Pr(S^  IMj*-15))  +Pr(IVflt-1)) 

( S;  is  true  iff  Vi  is  selected  for  reproduction  at  time  t.) 

Proof  Sketch.  Note  we  use  the  variable  R'^  is  true  iff 
Vj  replaces  V{  at  time  t.  First  we  show  that  Pr{ ivfit))  = 


(t)s 


Pr(Nflt~1)  A  A(Vj,Vi)eE  -j 

tf-1))  +  'L(VA.wteEPr&) 


+  E 


(vj  ,Vi)£E 


Pr(S<4)  A  M4)  A 


-iM4^  A  M  L  V>)  by  the  orig¬ 


inal  model.  Then,  by  the  definition  of  conditional  proba¬ 
bility,  we  get  PrV#-*  A  A (Vj,Vi)eE^)  =  iM/WT1*)  • 
(l  -  J2(Vj,Vi)eEPr(ST\ /W!^1>))-  Then’  by  further  manip¬ 


ulating  the  probability  axioms  as  well  as  Bayes  Theorem, 
we  obtain  Pr(  Sj4)  A  ftp  A  /W[4  4))  =  Wji  ■  Pr(IVtJt  1J)  • 
Pr(S^4)|A^t_1))  as  well  as  Pr(S'4)  A  -flE  A  A»f_1))  = 
(1  -  Wjif  Pr(Mlt~1))  ■  ^ ) .  After  showing  all  of 

these  items,  we  obtain  the  statement  of  the  theorem  through 
algebraic  manipulation.  ■ 

Although  finding  Pr(Sf  |Mj4  1^)  may  be  computationally 
intractable  in  practice,  the  good  news  is  that  for  neutral  drift 
(r  =  1),  these  conditional  probabilities  are  trivial  -  specif¬ 
ically,  we  have  Pr(S-,'))  =  1/N  for  all  i  and  this  is  inde¬ 
pendent  of  the  current  set  of  mutants  or  residents  in  the  EG. 
Hence,  we  can  simplify  Pr(M^)  as  follows. 


Corollary  1.  Under  r  =  1,  Pr(IVttl  )  equals 


jf  ]T  wji^Pr^-^-Pr^-^+Pri^) 

( vj  ,Vi)£E 


Studying  evolutionary  graph  theory  under  neutral  drift  was 
a  central  theme  in  work  such  as  [6,  7,  11]  as  it  provides 
an  intuition  on  the  effects  of  network  topology  on  mutant 
spread.  We  shall  focus  on  neutral  drift  in  this  paper  as  well. 
This  special  case  also  allows  us  to  strengthen  the  state¬ 
ment  of  Theorem  1  to  a  necessary  and  sufficient  condition  - 
showing  that  when  the  probabilities  of  all  nodes  are  equal, 
then  we  can  determine  the  fixation  probability. 


by  allows  us  to  apply  Theorem  1  and  obtain  the  statement 
of  this  theorem.  Let  vl  be  the  vertex  that  at  time  t  be¬ 
comes  achieves  the  greatest  increase  in  vertex  probability. 
At  time  t.  The  rest  follows  by  Corollary  1,  and  the  fact  that 
maxj  Pr(Nl-  1})  >  Pr(IVfj  Which  gives  us  the  upper 
bound.  The  second  part  of  the  proof,  which  is  used  for  the 
lower  bound,  mirrors  this  part.  ■ 

Under  neutral  drift,  we  can  show  that  fixation  prob¬ 
ability  is  additive  for  disjoint  sets.  A  similar  result  was 
proved  for  a  special  case  (that  is,  undirected  evolutionary 
graphs)  in  [6],  However,  our  proof  differs  from  theirs  in 
that  we  leverage  Corollary  1 . 

Theorem  5.  When  r  =  1  for  disjoint  sets  C.  I)  C  V,  Pc  + 
Pd  =  Pcud- 

Proof  Sketch.  Consider  some  time  t  and  vertex  vt.  Clearly, 
by  Corollary  1,  Pr(Hfp)  can  be  expressed  as  a  linear  com¬ 
bination  of  the  form  J2V  ■  Pr(l Vffi))  where  Cj  is 

a  coefficient.  We  note  that  these  coefficients  are  the  same 
regardless  of  the  initial  configuration  of  mutants  that  M|f) 
is  conditioned  on.  Hence,  Pr{h/fp is  this  positive 
function  with  Pr(M'->'> )  =  1  if  Vj  £  C  and  0  other¬ 
wise  (see  Proposition  3).  Hence,  for  disjoint  C,D,  for 
any  t>,  €  V,  we  have  Pr(Ntp\C^) +Pr(Ntp  |£>(0))  = 
/MM^KCU  D)(  °)).  The  statement  follows.  ■ 

In  [11],  the  author  observes  experimentally  (through 
simulation)  that  the  fixation  probability  computed  with 
neutral  drift  appears  to  be  a  lower  bound  on  the  fixation 
probability  calculated  with  a  mutant  fitness  r  >  1.  Here, 
we  prove  this  to  be  true  mathematically. 


Theorem  3.  When  r  =  1,  if  for  some  time  t,  Vz,  the  value 
PrilVfp)  is  the  same,  then  =  Pc. 

Proof  Sketch.  Consider  Pr(Mj4))  =  Pr(IVtlt~1))  + 
h'E(vi,Vt)€Ew3i  ■  (Pr(K*_1))  -  Pr(M(it~1) ))  when  for 
t  —  1,  Vz,  j  we  have  Pr(IVt-1  1-))  =  Pr(/W^  1-)).  Clearly, 

in  this  case,  the  value  for  Pr(M ■/'*)  =  Pr(/Wj4  1^).  As  the 
probabilities  of  all  vertices  was  the  same  at  t  —  1,  they  re¬ 
main  so  at  t.  Therefore,  in  this  case,  limt^oo  Pr(M|4))  = 
Pr(Ntp).  By  Theorem  1,  this  equals  Pq.  H 
Therefore,  under  neutral  drift,  we  can  determine  fixation 
probability  when  the  equation  of  Corollary  1  causes  all 
Pr(Mf})’s  to  be  equal.  We  can  also  use  Corollary  1  to 
find  bounds  on  the  fixation  probability  for  some  time  t  by 
the  following  result. 

Theorem  4.  For  any  time  t,  under  neutral  drift  (r  =  1), 
Pc  €  [mini  Pr(Ntfi ),  max,;  Pr(Nfp)\. 

Proof  Sketch.  For  any  time  t,  under  neutral  drift  (r  =  1), 
Pc  A  max,  Pr(IVfp). 

We  show  that  for  each  time  step  t,  max,  Pr(A?f  x))  > 
maxi  Pr(Hfp).  Hence,  by  showing  that,  for  any  time  t! , 
we  have  max.;  Pr(^/tf'>)  >  (zmt-xx,  maxjPr(AI^)  which 


Theorem  6.  For  a  given  set  C,  let  /  ,!  l  >  (C)  be  the  fixation 
probability  under  neutral  drift  and  P (C)  be  the  fixation 
probability  calculated  using  a  mutant  fitness  r  >  1.  Then, 
?W(C)<pW(C). 

Proof  Sketch.  After  introducing  some  notation,2  we  show 
first,  by  the  rules  of  dynamics,  we  show  that  if  a  some  time 
period  t,  the  probability  distribution  over  mutant  configu¬ 
rations  is  I,  mutant  fitness  r,  and  the  transition  functions 

(r)  (r) 

used  to  reach  the  next  time  step  are  x+  \X->  anc l 

2 Proof  Setup.  We  define  an  interpretation,  I  :  2V  —>  [0, 1]  as 
probability  distribution  over  mutant  configurations.  Hence,  for  some  I 
we  have  Ylce 2V  HP)  =  1-  Next,  we  define  a  transition  function  that 
maps  configurations  of  mutants  to  probabilities,  \  :  »  [0, 1]  where 

for  any  C  e  2V .  J2c'e2v  x(C,  Cf)  =  1.  We  will  use  x+  and  X— 
to  indicate  if  the  transition  is  made  with  a  mutant  being  selected  for 
birth  (x+)  or  resident  (x-)-  Hence,  for  some  C  E  V  and  v  ^  C, 
X-(C,CU{v})  =  0  andx+(CU{v},C)  =  0.  Hence,  for  all  C  E  2V , 
1PjC'£2v  (x+(£g  C;)  +  X—  (Pi  C7))  =  1*  If  the  transition  function  is 
based  on  birth-death  and  computed  with  some  r  >  1,  then  we  will  write 
(r)  (r) 

it  as  x_|_  ,  X_  respectively.  If  computed  with  r  =  1,  then  we  write 
X^d) ,  xLnd)  respectively.  For  some  C  E  2C ,  let  inc(C)  be  the  set  of 
all  elements  D  E  2V  s.t.  \D\  >  |C|  and  x+{C,D)  >  0.  For  some 
C  E  2C ,  let  dec(C)  be  the  set  of  all  elements  D  E  2V  s.t.  \D\  <  |C| 
and  X—  (C,  D)  >  0.  Given  set  C  C  V,  we  will  use  to  denote  the 
probability  of  fixation  given  initial  set  of  mutants  C  where  the  value  r  is 
used  to  calculate  all  transition  probabilities. 


Table  1:  Example  of  Algorithm  2  for  Example  2. 


T 

0 

1 

2 

49 

50 

Vl 

1.0 

0.775 

0.629 

0.1865 

0.1864 

V2 

0.0 

0.125 

0.175 

0.1859 

0.1859 

V3 

0.0 

0.000 

0.006 

0.1848 

0.1849 

Vi 

0.0 

0.125 

0.206 

0.1870 

0.1868 

T 

0.5 

0.388 

0.311 

0.0011 

0.0009 

subsequent  transitions  are  computed  using  the  same  dy¬ 
namics  with  neutral  drift,  then  the  fixation  probability  is: 
V(I,r)  = 

E ce2vI(C)  •  (E Demc(C)^;\c,D)  ■  P«)+ 

E  Dedec(C)  (X- J  (C;  D)  ■  P(o)))y  Then,  by  algebraic  manip¬ 
ulation,  we  show  that  for  some  r  <  r',  for  all  C,  D  £  2V , 
we  have  \+\C,D)  <  x+\C,D)  and  x~\C,D)  > 

X-  \C,D).  From  Theorem  5,  we  know  that  given  some 
C  £  2V ,  for  all  pairs  D,  D'  where  D  £  inc(C)  and 
D'  £  dec(C),  we  have  >  P\]) .  We  then  prove  that 
given  interpretation  I,  for  some  r  >  1,  V(I,r )  >  V(1, 0), 
from  which  the  theorem  follows.  I 


3.1  The  Algorithm 

Now  we  have  shown  all  the  necessary  properties  of  ver¬ 
tex  probabilities  to  create  our  algorithm.  Algorithm  2 
shows  pseudo-code  to  compute  the  fixation  probability  us¬ 
ing  our  method.  As  described  above,  our  method  has  found 
the  exact  fixation  probability  when  all  the  probabilities  in 
PrfM1)  (represented  in  the  pseudo-code  as  the  vector  p) 
are  equal.  We  use  Theorem  4  to  provide  a  convergence 
criteria  based  on  value  e,  which  we  can  prove  to  be  the  tol¬ 
erance  for  the  fixation  probability.  Next,  we  show  the  time 
complexity  and  approximation  guarantee  of  the  algorithm, 
which  follows  from  the  results  of  this  section.  We  also  pro¬ 
vide  an  example  of  how  it  works  in  Example  2. 

Proposition  4.  •  Where  Tsoi  is  the  number  of  time  steps 

to  convergence  ( based  on  e)  and  K  is  the  average 
in-degree  of  the  vertices  in  the  evolutionary  graph, 
0(TsoiNK)  is  the  time  complexity  of  Algorithm  2. 

•  Algorithm  2  returns  the  fixation  probability  Pq  within 

±e. 

Example  2.  Consider  the  scenario  introduced  in  Exam¬ 
ple  1.  Suppose  we  decide  to  use  Algorithm  2  to  compute 
the  fixation  probability  with  e  =  0.001.  Table  1  shows  the 
vertex  probability  vector  at  each  time  step,  along  with  the 
value  for  r  from  Algorithm  2.  Hence,  at  50  timesteps,  the 
algorithm  returns  a  fixation  probability  0/O.I86. 


Algorithm  2  -  Our  Novel  Solution  Method  to  Compute 
Fixation  Probabilities 

Input:  Evolutionary  Graph  ( N,V,W ),  configuration 
C  c  V,  natural  number  R  >  0,  and  real  number  e  >  0. 
Output:  Estimate  of  fixation  probability  of  mutant. 

1 :  Pi  is  the  ith  position  in  vector  p  corresponding  with 
vertex  £  V. 

2:  Set  pi  =  1  if  Vi  £  C  and  p,  =  0  otherwise. 

3:  q  p  {q  will  be  p  from  the  previous  time  step.} 

4:  T  1 
5:  while  r  >  f  do 

6:  for  Vi  £  V  do 

7:  sum  ■£-  0 

S:  m  {vj  £  V\ Wji  >  0} 

9:  for  Vj  £  TO  do 

10:  sum  =  sum  +  Wji  ■  (qj  —  q;) 

1 1 :  end  for 

12:  Pi  q;  +  1/N  ■  sum 

13:  end  for 

14:  q  P 

15:  r  <—  (1/2)  •  (maxp  —  minp) 

16:  end  while 

17:  return  (minp)  +  r 


In  the  above  result.  Since  T  depends  on  a  desired  tol¬ 
erance,  we  cannot  compare  our  algorithm’s  performance  to 
monte-carlo  simulation  without  the  right  termination  con¬ 
dition.  Thus  in  our  later  experiments  we  first  find  fixation 
probabilties  using  monte-calro  simulation  and  then  find  the 
number  of  timesteps  T  that  it  takes  our  solution  method  to 
find  a  fixation  probability  within  standard  error  of  the  sim¬ 
ulation  method. 

It  is  easy  to  show  that  the  expected  number  of  mutants 
as  time  approaches  infinity  is  equal  to  Pc  ■  N  -  as  Pq  is  the 
probability  of  being  in  the  state  where  all  the  vertices  are 
mutants  and  N  is  the  population  (as  the  only  other  possi¬ 
ble  state  as  time  approaches  infinity  is  extinction  -  which 
has  no  mutants).  For  any  time  step  t,  the  expected  number 
of  mutants  is  Eu  ev  IMM;*'*)  which  follows  directly  from 
Definition  6.  Hence,  returning  the  average  vertex  proba¬ 
bility  for  a  sufficiently  large  value  of  t  may  also  provide  a 
good  approximation.  We  also  note,  that  as  the  vertex  prob¬ 
abilities  converge,  the  standard  deviation  of  the  p  vector  in 
Algorithm  2  could  be  a  potentially  faster  convergence  cri¬ 
teria.  Note  that  using  standard  deviation  of  p  and  returning 
the  average  vertex  probability  would  no  longer  provide  us 
of  the  guarantee  in  Proposition  4,  however  it  may  provide 
good  results  in  practice.  The  modifications  to  the  algorithm 
would  be  as  follows:  line  15  would  be  r  f-  st.dev(p)  and 
line  17  would  be  return  avg(p).  We  will  refer  to  this  as 
Algorithm  2  with  alternate  convergence  criteria  or  Algo¬ 
rithm  2-ACC  for  short. 


Figure  3:  Convergence  of  the  minimum  (MinP),  maximum 
(MaxP),  and  average  (AvgP)  of  vertex  probabilities  towards 
the  final  fixation  probability  as  a  function  of  our  algorithm’s 
iterations  t  for  a  graph  of  100  nodes. 

4  Experimental  Evaluation 

Our  novel  method  for  computing  fixation  probabilities  on 
strongly  connected  directed  graphs  allows  us  to  compute 
near-exact  fixation  probabilities  within  a  desired  tolerance. 
The  time  complexity  of  our  method  is  highly  dependent 
on  how  fast  the  vertex  probabilities  converge.  In  this  sec¬ 
tion  we  experimentally  evaluate  how  the  vertex  probabil¬ 
ities  in  our  algorithms  converge.  We  also  provide  results 
from  comparison  experiments  to  support  the  claim  that  Al¬ 
gorithm  2-ACC  finds  adequate  fixation  probabilities  order 
of  magnitudes  faster  than  Monte  Carlo  simulations  ( Algo¬ 
rithm  1).  All  algorithms  were  implemented  in  Python  and 
run  on  a  2.33GHz  Intel  Xeon  CPU. 


Figure  4:  Standard  deviation  of  vertex  probabilities  as  a 
function  of  our  algorithm’s  iterations  for  the  same  100  node 
graph  of  Figure  3. 


4.1  Convergence  of  Vertex  Probabilities 

We  ran  our  algorithms  to  compute  fixation  probabilities 
on  randomly  weighted  and  strongly  connected  directed 
graphs  in  order  to  experimentally  evaluate  the  convergence 
of  the  vertex  probabilities.  We  generated  the  graphs  to  be 
scale  free  using  the  standard  preferential  attachment  growth 


model  [13]  and  randomly  assigned  an  initial  mutant  node. 
We  replaced  all  edges  in  the  graph  given  by  the  growth 
model  with  two  directed  edges  and  then  randomly  assigned 
weights  to  all  the  edges. 

Figure  3  shows  the  convergence  of  the  minimum, 
maximum,  and  the  average  of  vertex  probabilities  towards 
the  final  fixation  probability  value  for  a  small  graph  of  100 
nodes.  We  can  observe  that  the  average  converges  to  the  fi¬ 
nal  value  at  a  logarithmic  rate  and  much  faster  than  the  min¬ 
imum  and  maximum  vertex  probability  values.  This  sug¬ 
gests  that  while  Algorithm  2-ACC  does  not  give  the  same 
theoretical  guarantees  as  Algorithm  2,  it  is  much  preferable 
for  speed  since  the  minimum  and  maximum  vertex  proba¬ 
bilities  take  much  longer  to  converge  to  the  final  solution 
than  the  average.  The  fact  that  the  average  of  the  vertex 
probabilities  is  much  preferable  as  a  fast  estimation  of  fix¬ 
ation  probabilities  is  supported  by  the  logarithmic  decrease 
of  the  standard  deviation  of  vertex  probabilities  (see  Fig¬ 
ure  4).  Convergences  for  other  and  larger  graphs  are  not 
shown  here  but  are  qualitatively  similar  to  the  relative  con¬ 
vergences  shown  in  the  provided  graphs. 

4.2  Speed  Comparison  to  Monte  Carlo  Simulation 

In  order  to  compare  our  method’s  speed  compared  to  the 
standard  Monte  Carlo  simulation  method,  we  must  deter¬ 
mine  how  many  iterations  our  algorithm  must  be  run  to 
find  a  fixation  probability  estimate  comparable  to  that  of 
Algorithm  1.  Thankfully,  as  we  have  seen,  we  can  get  a 
standard  error  on  the  fixation  probability  returned  by  Al¬ 
gorithm  1  as  per  Proposition  2.  While  we  did  not  theoreti¬ 
cally  prove  anything  about  how  smoothly  fixation  probabil¬ 
ities  from  our  methods  approach  the  final  solution,  the  con¬ 
vergences  of  the  average  and  standard  deviation  as  shown 
above  strongly  suggest  that  estimates  from  our  method  ap¬ 
proach  the  final  solution  quite  gracefully.  In  fact,  in  the 
following  experiments  once  our  method  has  arrived  at  a 
fixation  probability  estimate  within  the  standard  error  of 
simulations,  the  estimate  never  again  fell  outside  the  win¬ 
dow  of  standard  error  (although  the  estimate  did  not  always 
approach  the  final  estimate  monotonically).  This  is  in  stark 
contrast  to  Monte  Carlo  simulations,  from  which  estima¬ 
tions  can  vary  greatly  before  the  method  has  completed 
enough  single  runs  to  achieve  a  good  probability  estimate. 

We  generated  a  number  of  randomly  weighted  and 
strongly  connected  directed  graphs  of  various  sizes  on 
which  we  compare  our  solution  method  to  Monte  Carlo  ap¬ 
proximation  of  fixation  probabilities.  The  graphs  were  gen¬ 
erated  as  in  our  convergence  experiments.  For  each  graph 
of  a  different  size,  we  generate  a  number  of  different  ini¬ 
tial  mutant  configurations.  We  found  fixation  probabilities 
both  using  Monte  Carlo  estimation  with  2000  simulation 
runs  and  our  direct  solution  method,  terminating  when  we 
have  reached  within  the  standard  error  of  the  Monte  Carlo 
estimation.  Since  the  average  vertex  probability  proved  to 
be  such  a  good  fast  estimate  of  the  true  fixation  probability, 
we  used  Algorithm  2-ACC. 


Table  2:  Experiment  Results.  For  each  graph  size,  shows 
average  number  of  timesteps  needed  for  each  single  simu¬ 
lation  in  the  Monte  Carlo  estimation  to  reach  extinction  or 
fixation  f  Avg  Ts jm),  and  the  average  number  of  iterations 
our  solution  algorithm  must  be  run  to  get  a  fixation  proba¬ 
bility  within  the  standard  error  of  simulations  (  AVG  Tso{). 


Table  3:  Experiment  Results.  For  each  graph  size,  shows 
the  average  seconds  taken  for  the  Monte  Carlo  method 
(Avg  Secs  Sim)  and  our  solution  method  (Avg  Secs  Sol)  to 
get  a  fixation  probability  within  the  standard  error  of  sim¬ 
ulations.  Also  shows  the  average  speedup  our  algorithm 
provides. 


100  19577.49  119.33 

300  190425.88  584.38 

500  857506.55  3240.2 

700  946080.54  152.86 

900  2216117.2  11052.83 


Avg  Secs  Sim  I  Avg  Secs  Sol  I  Avg  Speedup 


100  3876.69  1.12  3461.68 

300  41325.49  14.60  2829.74 

500  206088.44  174.29  1182.48 

700  220996.6  8T8  27025.63 

900  507439.37  790.69  1385347.44 


Tables  2  and  3  show  results  from  these  experiments, 
including  the  average  measured  values  for  TSim  (average 
timesteps  to  extinction  r  fixation  for  simulations),  Tsoi  (av¬ 
erage  timesteps  for  our  solution  method  to  get  within  stan¬ 
dard  error  of  Monte  Carlo  estimation),  and  actual  seconds 
taken  for  each  method  implemented.  Figure  5  shows  the 
speedup  our  solution  provides  over  Monte  Carlo  simula¬ 
tion.  Here  speedup  is  defined  as  the  ratio  of  the  time  it 
takes  for  simulations  to  complete  over  the  time  it  takes  our 
algorithm  to  find  a  fixation  probability  within  the  standard 
deviation. 


graph  size 


Figure  5:  Average  speedup  (on  a  log  scale)  for  finding 
fixation  probabilities  achieved  by  our  algorithm  vs  Monte 
Carlo  simulation  for  graphs  of  different  sizes. 

We  can  observe  from  our  experiments  that  comput¬ 
ing  fixation  probabilities  using  Monte  Carlo  simulations 
showed  to  be  a  very  time-expensive  process,  highlighting 
the  need  for  faster  solution  methods  as  the  one  we  have  pre¬ 
sented.  Especially  for  larger  graph  sizes,  the  time  complex¬ 
ity  of  our  solution  to  achieve  similar  results  to  Monte  Carlo 
simulation  has  shown  to  be  orders  of  magnitude  smaller 
than  the  standard  method. 


5  Related  Work 

While  most  work  dealing  with  evolutionary  graphs  rely  on 
Monte  Carlo  simulation,  there  are  some  good  analytical  ap¬ 
proximations  for  the  undirected  cased  based  on  the  degree 
of  the  vertices  in  question.  In  [5],  the  authors  use  the  mean- 
field  approach  to  create  these  approximations  for  the  undi¬ 
rected  case.  In  [6],  the  authors  derive  an  exact  analytical 
result  for  the  undirected  case  in  neutral  drift,  which  agrees 
with  the  results  of  [5].  They  also  show  that  fixation  prob¬ 
ability  is  additive  in  that  case  (a  result  which  we  extend 
in  this  paper  using  a  different  proof  technique).  However, 
the  results  of  [7]  demonstrate  that  mean-field  approxima¬ 
tions  break  down  in  the  case  of  weighted,  directed  graphs. 
That  work  is  followed  by  [1 1]  which  also  studies  weighted, 
directed  graphs,  but  does  so  by  using  Monte  Carlo  simula¬ 
tion.  In  [8]  the  authors  derive  exact  computation  of  fixation 
probability  through  means  of  linear  programming.  How¬ 
ever,  that  approach  requires  an  exponential  number  of  both 
constraints  and  variables  and  is  intractable.  In  [10],  the  au¬ 
thors  present  a  technique  for  speeding  up  Monte  Carlo  sim¬ 
ulations  by  early  termination.  However,  our  algorithm  dif¬ 
fers  in  that  it  does  not  rely  on  simulation  at  all  and  provides 
a  deterministic  result.  Our  method  totally  avoids  simulation 
and  instead  leverages  properties  of  the  model.  Recently,  in 
[14],  the  authors  study  the  related  problem  of  determining 
the  probability  of  fixation  given  a  single,  randomly  placed 
mutant  in  the  graph  where  the  vertices  are  “islands”  and 
there  are  many  individuals  residing  on  each  island  in  a  well- 
mixed  population.  They  use  quasi-fixed  points  of  ODE’s  to 
obtain  an  approximation  of  the  fixation  probability  and  per¬ 
formed  experiments  with  a  maximum  of  5  islands  (vertices) 
containing  50  individuals  each.  It  is  unclear  if  their  meth¬ 
ods  can  be  applied  to  the  problem  presented  in  this  paper. 
In  [15,  16]  the  authors  present  a  generalized  frameworks  for 
social  network  diffusion.  These  works  primarily  focus  on 
diffusion  models  that  are  monotonic  -  where  the  number  of 
vertices  with  a  certain  property  increases  at  each  time  step. 
While  [15]  does  show  how  to  adjust  their  framework  for 
non-monotonic  models,  they  only  do  so  for  a  finite  number 
of  time  steps  (fixation  probability  is  in  the  limit  of  time). 


6  Conclusion 

In  this  paper,  we  presented  a  novel  deterministic  algo¬ 
rithm  for  quickly  computing  fixation  probability  in  di¬ 
rected,  weighted  strongly  connected  evolutionary  graphs 
under  neutral  drift,  which  we  prove  to  be  a  lower  bound  for 
the  case  where  a  mutant  is  more  fit  than  the  rest  of  the  pop¬ 
ulation.  In  our  experiments,  we  showed  our  approach  to 
outperform  Monte  Carlo  simulations,  which  are  currently 
used  in  most  evolutionary  graph  theory  research,  by  sev¬ 
eral  orders  of  magnitude.  We  also  show  that  under  neutral 
drift,  fixation  probability  is  additive  -  showing  optimal  sub¬ 
structure  in  this  case. 

Our  algorithm  relied  on  the  convergence  of  the  “ver¬ 
tex  probabilities”  -  the  probability  of  an  individual  being  a 
mutant  at  a  certain  time.  As  our  experiments  demonstrated 
that  this  convergence  generally  occurs  rapidly,  it  is  a  tempt¬ 
ing  conjecture  that  the  time  to  convergence  is  polynomial 
in  the  size  of  the  graph.  If  this  is  the  case,  we  suspect  that 
fitness-based  selection  of  the  reproducing  individual  in  the 
network  may  be  a  source  of  complexity  in  this  problem.  A 
more  complete  complexity  analysis  of  evolutionary  graph 
problems  may  still  be  in  order. 

An  important  limitation  of  our  algorithm  is  that  it  is 
limited  to  strongly  connected  graphs.  However,  we  be¬ 
lieve  that  our  algorithm  can  be  used  in  solutions  to  gen¬ 
eral  graphs  by  breaking  the  graph  into  its  strongly  con¬ 
nected  components  and  considering  transition  edges  be¬ 
tween  these  in  the  computation  of  the  overall  fixation  prob¬ 
abilities.  Such  an  extension  is  an  immediate  goal  for  future 
work. 
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